A study of the efficiency of pooling in haplotype estimation

Authors

  • Anthony Y. C. Kuk
  • Jinfeng Xu
  • Yaning Yang
Abstract

MOTIVATION: It has been claimed in the literature that pooling DNA samples is efficient in estimating haplotype frequencies. There is, however, no theoretical justification based on calculation of statistical efficiency. In fact, the limited evidence given so far is based on simulation studies with small numbers of loci. With rapid advances in technology, it is of interest to see whether pooling is still efficient when the number of loci increases.

METHODS: Instead of resorting to simulation studies, we make use of asymptotic statistical theory to perform exact calculation of the efficiency of pooling relative to no pooling in the estimation of haplotype frequencies. As an intermediate step, we use the log-linear formulation of the haplotype probabilities and derive the asymptotic variance-covariance matrix of the maximum likelihood estimators of the canonical parameters of the log-linear model.

RESULTS: Based on our calculations under linkage equilibrium, pooling can suffer a huge loss in efficiency relative to no pooling when there are more than three independent loci and the alleles are not rare. Pooling works better for rare alleles. In particular, if all the minor allele frequencies are 0.05, pooling maintains an advantage over no pooling until the number of independent loci reaches 6. High linkage disequilibrium effectively reduces the number of independent loci by ruling out certain haplotypes from occurring. Similar calculations of efficiency for the case of no pooling justify the common belief that it is not worthwhile to use molecular methods to resolve the phase ambiguity of individual genotype data.

AVAILABILITY: The R code for the calculation is available at http://www.stat.nus.edu.sg/~staxj/pooling

CONTACT: [email protected].
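As a rough illustration of the log-linear formulation mentioned in the abstract, the sketch below covers only the simplest case: biallelic loci under linkage equilibrium, where a haplotype's probability is the product of its per-locus allele frequencies and the log-linear model has main effects only. It is not the authors' R code and does not compute the efficiency of pooling; the function names `haplotype_frequencies` and `canonical_parameters` are hypothetical and chosen here for illustration.

```python
import math
from itertools import product


def haplotype_frequencies(minor_allele_freqs):
    """Haplotype frequencies for biallelic loci under linkage equilibrium.

    Each haplotype is a tuple of 0/1 alleles (1 = minor allele). Under
    linkage equilibrium its probability is the product of the per-locus
    allele frequencies.
    """
    freqs = {}
    for hap in product((0, 1), repeat=len(minor_allele_freqs)):
        p = 1.0
        for allele, maf in zip(hap, minor_allele_freqs):
            p *= maf if allele == 1 else 1.0 - maf
        freqs[hap] = p
    return freqs


def canonical_parameters(minor_allele_freqs):
    """Main-effect parameters of the log-linear model (assumed form).

    With independent loci, log p(h) = const + sum_j theta_j * h_j,
    where theta_j = log(q_j / (1 - q_j)) and
    const = sum_j log(1 - q_j).
    """
    thetas = [math.log(q / (1.0 - q)) for q in minor_allele_freqs]
    const = sum(math.log(1.0 - q) for q in minor_allele_freqs)
    return const, thetas


def loglinear_probability(hap, const, thetas):
    """Evaluate p(h) from the log-linear parameters."""
    return math.exp(const + sum(t * a for t, a in zip(thetas, hap)))
```

With three loci and all minor allele frequencies set to 0.05 (the value used in the abstract), `haplotype_frequencies([0.05] * 3)` returns 2^3 = 8 haplotypes whose frequencies sum to 1, and `loglinear_probability` reproduces each of them from the canonical parameters. Linkage disequilibrium would add interaction terms to the log-linear model, which this sketch leaves out.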

Related articles

Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA.

The efficiency of single-nucleotide polymorphism haplotype analysis may be increased by DNA pooling, which can dramatically reduce the number of genotyping assays. We develop a method for obtaining maximum likelihood estimates of haplotype frequencies for different pool sizes, assess the accuracy of these estimates, and show that pooling DNA samples is efficient in estimating haplotype frequenc...

On the use of DNA pooling to estimate haplotype frequencies.

Genome-wide association studies may be necessary to identify genes underlying certain complex diseases. Because such studies can be extremely expensive, DNA pooling has been introduced, as it may greatly reduce the genotyping burden. Parallel to DNA pooling developments, the importance of haplotypes in genetic studies has been amply demonstrated in the literature. However, DNA pooling of a larg...

HAPLOPOOL: improving haplotype frequency estimation through DNA pools and phylogenetic modeling

MOTIVATION The search for genetic variants that are linked to complex diseases such as cancer, Parkinson's, or Alzheimer's disease, may lead to better treatments. Since haplotypes can serve as proxies for hidden variants, one method of finding the linked variants is to look for case-control associations between the haplotypes and disease. Finding these associations requires a high-quality est...

A Convolutional Neural Network based on Adaptive Pooling for Classification of Noisy Images

The convolutional neural network is an effective method for classifying images; it learns using convolutional, pooling, and fully connected layers. All kinds of noise disrupt the operation of this network. Noisy images reduce classification accuracy and increase convolutional neural network training time. Noise is an unwanted signal that destroys the original signal. Noise chang...

Early detection of MS in fMRI images using deep learning techniques

Introduction & Objective: MS is a disease of the central nervous system in which the body's immune defenses attack its own tissues. The disease can affect the brain and spinal cord, causing a wide range of potential symptoms, including balance, movement, and vision problems. MRI and fMRI images are a very important tool in the diagnosis and treatment of MS. The aim of this study was to provide...


Journal:
  • Bioinformatics

Volume 26, Issue 20

Pages -

Publication date: 2010